Abstract:- The Carry Select Adder (CSLA) is used in many data processing processors to perform fast
arithmetic functions. Adders are the basic building blocks of digital integrated circuit designs. Ripple carry
adders are the slowest adders because every full adder must wait until the carry is generated by the previous
full adder. A modified SQRT CSLA architecture has been developed and compared with the regular SQRT CSLA
architecture. This work evaluates the performance of the proposed design in terms of delay, area and power.
The result analysis shows that the proposed CSLA structure has reduced area, power and delay compared with
the regular SQRT CSLA.
I. INTRODUCTION

In VLSI system design, area and power efficient high speed logic systems are essential. In digital
adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum
for each bit position in an elementary adder is generated sequentially, only after the previous bit position has
been summed and a carry propagated into the next position.
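To make the carry dependency concrete, the following Python sketch models a ripple carry adder one bit at a time. The function names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) and the LSB-first bit-list representation are illustrative assumptions, not part of the original design.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | ((a ^ b) & cin)
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first).

    Each stage consumes the carry produced by the previous stage,
    so the worst-case delay grows linearly with the word length.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    """Unsigned integer -> LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list -> unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` yields the 4-bit sum 1 with a carry out of 1, since 11 + 6 = 17.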

The CSLA is used in many systems to overcome the problem of carry propagation delay by independently
generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area
efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sums and carries for
carry input Cin = 0 and Cin = 1; multiplexers (mux) then select the final sum and carry.
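The selection principle can be sketched behaviourally in Python as follows. This is a minimal model, not the authors' implementation: `ripple_add` stands in for an RCA, and the default group widths (2, 2, 3, 4, 5) are an assumption taken from the sum ranges of the 16-b SQRT CSLA discussed later in the paper.

```python
def ripple_add(a, b, cin, width):
    """Behavioural stand-in for a `width`-bit RCA: returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_block(a, b, cin, width):
    """One CSLA group: two RCAs work in parallel, a mux picks one result."""
    sum0, c0 = ripple_add(a, b, 0, width)    # speculative result for Cin = 0
    sum1, c1 = ripple_add(a, b, 1, width)    # speculative result for Cin = 1
    return (sum1, c1) if cin else (sum0, c0)  # mux driven by the real carry


def carry_select_add(a, b, group_widths=(2, 2, 3, 4, 5)):
    """Chain the groups LSB first; only the mux select passes between groups."""
    result, shift, carry = 0, 0, 0
    for width in group_widths:
        mask = (1 << width) - 1
        s, carry = carry_select_block((a >> shift) & mask,
                                      (b >> shift) & mask, carry, width)
        result |= s << shift
        shift += width
    return result, carry
```

Both candidate results of every group are available before the incoming carry arrives, so each group adds only a mux delay to the carry path, at the cost of duplicating the adder hardware.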

A Binary to Excess-1 Converter (BEC) is used instead of the RCA with Cin = 1 in the regular CSLA to
achieve lower area and power consumption. The main advantage of the BEC logic is that it needs fewer logic
gates than the Full Adder (FA) structure.

II. CALCULATION OF DELAY AND AREA OF THE BASIC ADDER BLOCKS

The AND, OR and INVERTER (AOI) implementation of an XOR gate is shown in fig 1. The gates between
the dotted lines operate in parallel, and the number on each gate indicates the delay contributed by that gate.
The delay and area evaluation methodology considers all gates to be made up of AND, OR, and INVERTER, each
having delay equal to 1 unit and area equal to 1 unit. We then add up the number of gates in the longest path
of a logic block that contributes to the maximum delay. The area evaluation is done by counting the total
number of AOI gates required for each logic block. Based on this approach, the CSLA adder blocks of 2:1 mux,
Half Adder (HA), and FA are evaluated and listed in table I.

Fig 1: Delay and area evaluation of XOR gate
Fig 2: 4 bit BEC
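The unit-gate methodology can be sketched as a small netlist evaluator. The netlist encoding, the helper `xor_gates`, and the particular AOI decompositions chosen for the mux, HA and FA are assumptions made for illustration; the figures and table I of the paper remain the reference.

```python
def xor_gates(a, b, out):
    """AOI expansion of out = a XOR b: 2 INVERTER, 2 AND, 1 OR."""
    return {
        f"{out}_na": ("NOT", [a]),
        f"{out}_nb": ("NOT", [b]),
        f"{out}_t0": ("AND", [a, f"{out}_nb"]),
        f"{out}_t1": ("AND", [f"{out}_na", b]),
        out: ("OR", [f"{out}_t0", f"{out}_t1"]),
    }


# A netlist maps each gate output to (gate_type, input_signals).
# Any signal that is not a key is a primary input arriving at t = 0.
XOR = xor_gates("a", "b", "y")
MUX2 = {
    "ns": ("NOT", ["sel"]),
    "t0": ("AND", ["d0", "ns"]),
    "t1": ("AND", ["d1", "sel"]),
    "y": ("OR", ["t0", "t1"]),
}
HALF_ADDER = {**xor_gates("a", "b", "sum"), "carry": ("AND", ["a", "b"])}
FULL_ADDER = {
    **xor_gates("a", "b", "p"),
    **xor_gates("p", "cin", "sum"),
    "g": ("AND", ["a", "b"]),
    "pc": ("AND", ["p", "cin"]),
    "cout": ("OR", ["g", "pc"]),
}


def area(netlist):
    """Area: every AND, OR and INVERTER counts as one unit."""
    return len(netlist)


def delay(netlist):
    """Delay: number of gates on the longest input-to-output path."""
    memo = {}

    def depth(signal):
        if signal not in netlist:       # primary input
            return 0
        if signal not in memo:
            _, inputs = netlist[signal]
            memo[signal] = 1 + max(depth(i) for i in inputs)
        return memo[signal]

    return max(depth(s) for s in netlist)
```

Under these decompositions the areas come out as 5 (XOR), 4 (2:1 mux), 6 (HA) and 13 (FA), which agree with the gate counts used in the group calculations of this paper, and the mux delay comes out as 3 units.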

III. BINARY TO EXCESS-1 CONVERTER 

To reduce the area and power consumption, a Binary to Excess-1 Converter is used instead of the RCA
with Cin = 1. This is the main concept of the paper, and it is also intended to reduce delay compared to the
regular SQRT CSLA. To replace an n-bit RCA, an (n+1)-bit BEC is required. The structure and the function
table of a 4-b BEC are shown in fig 2 and table II, respectively.
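The reason an (n+1)-bit BEC can stand in for the second RCA is that the result for Cin = 1 is simply the result for Cin = 0 plus one, taken over the n sum bits and the carry bit together. The following behavioural check illustrates this; the function names are hypothetical and the adders are modelled with integer arithmetic rather than gates.

```python
def bec(value, width):
    """Binary to Excess-1 conversion: add one, keeping `width` bits."""
    return (value + 1) & ((1 << width) - 1)


def rca(a, b, cin, n):
    """Behavioural n-bit RCA; the carry-out is packed as bit n of the result."""
    return (a + b + cin) & ((1 << (n + 1)) - 1)


def check_equivalence(n):
    """Exhaustively compare RCA(Cin = 1) with an (n+1)-bit BEC on RCA(Cin = 0)."""
    for a in range(1 << n):
        for b in range(1 << n):
            if rca(a, b, 1, n) != bec(rca(a, b, 0, n), n + 1):
                return False
    return True
```

Calling `check_equivalence(2)` covers the 2-b RCA and 3-b BEC pairing used in group2 of the proposed adder.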

Fig 3 illustrates how the basic function of the CSLA is obtained by using the 4-bit BEC together with
the mux. One input of the 8:4 mux is the direct input (B3, B2, B1, and B0) and the other input of the mux is the
BEC output. This produces the two possible partial results in parallel, and the mux selects either the BEC
output or the direct inputs according to the control signal Cin. The Boolean expressions of the 4-bit BEC are
listed below (~ NOT, & AND, ^ XOR):

X0 = ~B0

X1 = B0 ^ B1

X2 = B2 ^ (B0 & B1)

X3 = B3 ^ (B0 & B1 & B2)
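The four Boolean expressions above can be checked with a short gate-level sketch in Python. The function names and the MSB-first tuple convention are assumptions for illustration only.

```python
def bec4(b3, b2, b1, b0):
    """Gate-level 4-bit BEC; returns (x3, x2, x1, x0)."""
    x0 = b0 ^ 1                  # NOT b0
    x1 = b0 ^ b1
    x2 = b2 ^ (b0 & b1)
    x3 = b3 ^ (b0 & b1 & b2)
    return x3, x2, x1, x0


def bec4_with_mux(bits, cin):
    """8:4 mux: direct inputs for cin = 0, BEC output for cin = 1."""
    return bec4(*bits) if cin else tuple(bits)


def to_bits4(value):
    """Unsigned integer -> MSB-first tuple (b3, b2, b1, b0)."""
    return tuple((value >> i) & 1 for i in (3, 2, 1, 0))


def to_int(bits):
    """MSB-first bit tuple -> unsigned integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

Enumerating all 16 input patterns shows that each BEC output equals the input plus one, with 1111 wrapping around to 0000.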

Table 1 Delay and area count of the basic blocks of CSLA
IV. DELAY AND AREA EVALUATION METHODOLOGY OF REGULAR 16-B SQRT CSLA

The structure of the 16-b regular SQRT CSLA is shown in fig 4. It has five groups of RCAs of different
sizes. The delay and area evaluation of each group is shown in fig 6, in which the numerals specify the delay
values, e.g., sum2 requires 10 gate delays. The steps leading to the evaluation are as follows.

1) The group2 [in fig 6(a)] has two sets of 2-b RCA. Based on the delay values of table I, the arrival time
of the selection input c1[time(t) = 7] of the 6:3 mux is earlier than s3[t = 8] and later than s2[t = 6].
Thus, sum3[t = 11] is the summation of s3 and mux[t = 3], and sum2[t = 10] is the summation of c1 and mux.

2) Except for group2, the arrival time of the mux selection input is always greater than the arrival time of
the data outputs from the RCAs. Thus, the delay of group3 to group5 is determined, respectively, as follows:
{c6, sum[6:4]} = c3[t = 10] + mux
{c10, sum[10:7]} = c6[t = 13] + mux
{cout, sum[15:11]} = c10[t = 16] + mux

Low Power, Area And Delay Efficient Carry Select Adder Using.


3) One set of 2-b RCA in group2 has 2 FA for Cin = 1 and the other set has 1 FA and 1 HA for Cin = 0. Based
on the area count of table I, the total number of gate counts in group2 is determined as follows:

Gate count = 57 (FA + HA + Mux)

FA = 39 (3 * 13)

HA = 6 (1 * 6)

Mux = 12 (3 * 4)
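The group2 count can be generalised to the other groups with a short sketch. The composition assumed here (an RCA of n-1 FA plus 1 HA for Cin = 0, an RCA of n FA for Cin = 1, and n+1 two-input muxes) and the group widths 2, 3, 4 and 5 are inferred from the group2 breakdown and the sum ranges in step 2; they are not stated as a formula in the paper.

```python
AREA = {"FA": 13, "HA": 6, "MUX": 4}   # unit-gate areas used in the text


def regular_group_area(n):
    """Gate count of an n-bit group of the regular CSLA (assumed composition).

    RCA for Cin = 0: (n - 1) FA + 1 HA
    RCA for Cin = 1: n FA
    Selection:       (n + 1) 2:1 muxes, n sum bits plus the carry
    """
    fa = (2 * n - 1) * AREA["FA"]
    ha = AREA["HA"]
    mux = (n + 1) * AREA["MUX"]
    return fa + ha + mux


GROUP_WIDTHS = {"group2": 2, "group3": 3, "group4": 4, "group5": 5}
regular_areas = {name: regular_group_area(n) for name, n in GROUP_WIDTHS.items()}
```

With these assumptions the sketch gives 57, 87, 117 and 147 gates for group2 to group5, which matches the area column reported for the regular SQRT CSLA.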
Fig 3: 4 bit BEC with 8:4 mux
Table 2 Function table of 4 b BEC
Similarly, the estimated maximum delay and area of the other groups in the regular SQRT CSLA are
evaluated and listed in table 3.
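The delay chain described in steps 1 and 2 can be written out as a few lines of arithmetic. The function name and argument names are hypothetical; the arrival times c1 = 7 and s3 = 8 and the mux delay of 3 are the values quoted in the text.

```python
def regular_group_delays(c1=7, s3=8, mux=3):
    """Maximum output arrival time of group2..group5 in unit gate delays."""
    group2 = max(c1, s3) + mux   # sum3 waits for the later data input s3
    c3 = c1 + mux                # carry out of group2 is set by the select c1
    c6 = c3 + mux                # group3: select input arrives last
    c10 = c6 + mux               # group4
    cout = c10 + mux             # group5
    return {"group2": group2, "group3": c6, "group4": c10, "group5": cout}
```

The returned values are 11, 13, 16 and 19, in agreement with the delay column for the regular SQRT CSLA groups.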
Fig 4: REGULAR 16 bit SQRT CSLA
Fig 5: PROPOSED 16 bit SQRT CSLA
Table 3 Delay and Area count of REGULAR SQRT CSLA Groups

Groups     Delay   Area
Group1     7       26
Group2     11      57
Group3     13      87
Group4     16      117
Group5     19      147

Table 4 Delay and Area count of PROPOSED SQRT CSLA Groups

Groups     Delay   Area
Group1     7       26
Group2     13      43
Group3     16      61
Group4     19      84
Group5     22      107

V. METHOD PROPOSED FOR 16 BIT CSLA BASED ON THE BEC-1 CONVERTER 

The structure of the proposed 16-b SQRT CSLA, which uses a BEC in place of the RCA with Cin = 1 to
optimize area and power, is shown in fig 5. We again split the structure into five groups. The delay and area
evaluation of each group is shown in fig 7.

1) The group2 [in fig 7(a)] has one 2-b RCA, which has 1 FA and 1 HA for Cin = 0. Instead of another 2-b
RCA with Cin = 1, a 3-b BEC is used, which adds one to the output of the 2-b RCA. Based on the delay values
of table I, the arrival time of the selection input c1[time(t) = 7] of the 6:3 mux is earlier than s3[t = 9]
and c3[t = 10] and later than s2[t = 4]. Thus, sum3 and the final c3 (output from the mux) depend on s3 and
the mux, and on the partial c3 (input to the mux) and the mux, respectively. The sum2 depends on c1 and the mux.

For the remaining groups, the arrival time of the mux selection input is always greater than the arrival time
of the data inputs from the BECs. Thus, the delay of the remaining groups depends on the arrival time of the
mux selection input and the mux delay.

2) The area count of group2 is determined as follows:
Gate count = 43 (FA + HA + Mux + BEC)

FA = 13 (1 * 13)
HA = 6 (1 * 6)
AND = 1
NOT = 1
XOR = 10 (2 * 5)
Mux = 12 (3 * 4)

3) Similarly, the estimated maximum delay and area of the other groups of the PROPOSED SQRT CSLA
are evaluated and listed in table 4. Comparing tables 3 and 4, the proposed design has a lower gate count in
group2 to group5 (for example, 43 instead of 57 in group2), and therefore lower area and power, at the cost of
2 to 3 additional unit gate delays per group.
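The comparison of tables 3 and 4 can be reproduced with a few lines of Python. The dictionaries simply restate the table values; the variable and function names are illustrative.

```python
# (delay, area) per group, copied from tables 3 and 4
REGULAR = {"group1": (7, 26), "group2": (11, 57), "group3": (13, 87),
           "group4": (16, 117), "group5": (19, 147)}
PROPOSED = {"group1": (7, 26), "group2": (13, 43), "group3": (16, 61),
            "group4": (19, 84), "group5": (22, 107)}


def total_area(table):
    """Sum of the unit-gate areas of all groups."""
    return sum(area for _, area in table.values())


def area_saving_percent(regular, proposed):
    """Relative reduction in total gate count, in percent."""
    before, after = total_area(regular), total_area(proposed)
    return 100.0 * (before - after) / before


# Group2 of the proposed design, from the gate counts in step 2
group2_parts = {"FA": 13, "HA": 6, "AND": 1, "NOT": 1, "XOR": 10, "MUX": 12}
group2_area = sum(group2_parts.values())

regular_total = total_area(REGULAR)
proposed_total = total_area(PROPOSED)
saving = area_saving_percent(REGULAR, PROPOSED)
delay_change = {g: PROPOSED[g][0] - REGULAR[g][0] for g in REGULAR}
```

The totals are 434 gates for the regular design and 321 gates for the proposed design, a saving of about 26%, while the unit-gate delay of group2 to group5 rises by 2 to 3 gate delays.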
Fig 6 Delay and area evaluation of REGULAR SQRT CSLA: (a) group2, (b) group3, (c) group4 and (d) group5
Fig 7 Delay and area evaluation of PROPOSED SQRT CSLA: (a) group2, (b) group3, (c) group4 and (d) group5
Fig 8: Graphical charts representing area
Table 5 Comparison of REGULAR and PROPOSED SQRT CSLA

Word size  Adder          Delay   Area   Leakage  Switching  Total    Power-delay  Area-delay
                                         power    power      power    product      product
8-bit      Regular CSLA   1.719   ?      ?        ?          ?        ?            ?
8-bit      Modified CSLA  1.956   ?      0.006    ?          188.4    368.?        1752.4
16-bit     Regular CSLA   2.77?   2272   0.017    263.7      527.?    1463.?       ?
16-bit     Modified CSLA  3.048   1929   0.013    235.9      471.8    ?            5879.6
32-bit     Regular CSLA   5.137   47?3   0.036    563.?      1127.?   5790.9       ?
32-bit     Modified CSLA  5.4?2   ?      0.027    454.9      ?        ?            ?
64-bit     Regular CSLA   ?.174   ?      0.075    1212.4     2425.0   22246.9      90969.3
64-bit     Modified CSLA  9.519   ?      0.057    1028.0     ?        ?            77893.9


VII. SIMULATION RESULTS

Fig 9: Simulation result for REGULAR 16 b SQRT CSLA

VIII. CONCLUSION

A simple approach is proposed in this paper to reduce the area, delay and power of the SQRT CSLA
architecture. The reduced number of gates in this work lowers the area and the total power. The compared
results show that the delay, area and power of the proposed SQRT CSLA are reduced relative to the regular
SQRT CSLA: in the Xilinx results below, the slice count falls from 25 to 23, the 4 input LUT count from 45 to
40, and the delay from 20.215ns to 18.867ns.

Area and delay values of the REGULAR SQRT CSLA and the PROPOSED SQRT CSLA are given below. They
are evaluated with the Xilinx tools for the REGULAR and PROPOSED SQRT CSLA designs.
1) Area of REGULAR SQRT CSLA
Number of Slices : 25 out of 4656 2%

Number of 4 input LUTs : 45 out of 9312 2%

Number of IOs : 50

Number of bonded IOBs : 50 out of 232 21%

Delay of REGULAR SQRT CSLA is 20.215ns

Fig 10 simulation result for PROPOSED 16 b SQRT CSLA 
2) Area of PROPOSED SQRT CSLA 
Number of Slices : 23 out of 4656 2% 

Number of 4 input LUTs : 40 out of 9312 2% 

Number of IOs : 50 

Number of bonded IOBs :50 out of 232 21% 

Delay of MODIFIED SQRT CSLA is 18.867ns 
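
The relative improvement implied by the two Xilinx reports can be computed directly. The figures are those listed above; the dictionary keys and the helper `reduction_percent` are hypothetical names used only for this sketch.

```python
REGULAR = {"slices": 25, "luts": 45, "delay_ns": 20.215}
PROPOSED = {"slices": 23, "luts": 40, "delay_ns": 18.867}


def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


reductions = {key: round(reduction_percent(REGULAR[key], PROPOSED[key]), 2)
              for key in REGULAR}
```

This gives a reduction of 8% in slices, about 11.1% in 4 input LUTs and about 6.7% in delay for the proposed 16-b design.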